RNA-seq analysis for the Jagesh Shah group at Longwood.

Contact Lorena Pantano (lpantano@hsph.harvard.edu) for additional details.

This HTML document was last updated: Thu Feb 2 16:54:53 2017

The sections below provide code to reproduce the included results and plots.

Overview

Gene counts were calculated from the Sailfish transcript counts.
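The gene-level summarization mentioned above can be illustrated with a minimal sketch. It assumes that gene counts are obtained by summing the estimated counts of all transcripts mapped to the same gene, which is what tximport (listed in the session info) does in its simplest mode. The transcript and gene identifiers and the `summarize_to_gene` helper are hypothetical and are not taken from this analysis.

```python
from collections import defaultdict


def summarize_to_gene(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts.

    tx_counts: dict mapping transcript ID -> estimated count (one sample)
    tx2gene:   dict mapping transcript ID -> gene ID
    Transcripts missing from tx2gene are skipped.
    """
    gene_counts = defaultdict(float)
    for tx_id, count in tx_counts.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is None:
            continue  # no gene annotation for this transcript
        gene_counts[gene_id] += count
    return dict(gene_counts)


# Hypothetical toy input, for illustration only.
tx_counts = {"tx1": 10.5, "tx2": 4.5, "tx3": 7.0, "tx4": 2.0}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = summarize_to_gene(tx_counts, tx2gene)
print(gene_counts)
```

In this toy input, tx4 has no gene annotation and is therefore dropped from the gene-level table.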

Differential expression

Dispersion estimates

Comparison: mice_model_jck_wt


out of 28948 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 5626, 19%
LFC < 0 (down) : 4691, 16%
outliers [1] : 0, 0%
low counts [2] : 6665, 23%
(mean count < 1)
[1] see ‘cooksCutoff’ argument of ?results
[2] see ‘independentFiltering’ argument of ?results
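The adjusted p-values in the summary above are corrected for multiple testing. DESeq2 uses the Benjamini-Hochberg (BH) procedure by default; assuming the default was kept here, the adjustment can be sketched as follows. The `bh_adjust` helper and the toy p-values are illustrative assumptions, not part of the original analysis.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy p-values, for illustration only.
toy_pvalues = [0.01, 0.04, 0.03, 0.005]
toy_padj = bh_adjust(toy_pvalues)
print(toy_padj)
```

Genes removed by independent filtering (the "low counts" row) receive no adjusted p-value in DESeq2, so they are not part of this calculation.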

Differential expression file at: mice_model_jck_wt_de.csv

Normalized counts matrix file at: mice_model_jck_wt_log2_counts.csv

MA plot

Volcano plot

QC for DE genes p-values/variance

Most significant, FDR < 0.1 and |log2FC| > 0: 10317
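The count above equals the sum of the up- and down-regulated genes in the DESeq2 summary (5626 + 4691 = 10317), which suggests that the log2FC threshold is applied to the absolute value. A minimal sketch of such a filter follows. The `count_de` helper and the toy rows are hypothetical; the real values are in mice_model_jck_wt_de.csv.

```python
def count_de(rows, fdr, min_abs_lfc):
    """Count up- and down-regulated genes passing FDR and |log2FC| cutoffs.

    rows: iterable of dicts with 'log2FoldChange' and 'padj' keys;
          padj is None for genes removed by independent filtering.
    """
    up = down = 0
    for row in rows:
        padj = row["padj"]
        lfc = row["log2FoldChange"]
        if padj is None or padj >= fdr:
            continue  # not tested or not significant
        if lfc > min_abs_lfc:
            up += 1
        elif lfc < -min_abs_lfc:
            down += 1
    return up, down


# Hypothetical toy rows, for illustration only.
toy_rows = [
    {"gene": "g1", "log2FoldChange": 2.15, "padj": 0.001},
    {"gene": "g2", "log2FoldChange": -1.61, "padj": 0.02},
    {"gene": "g3", "log2FoldChange": 0.30, "padj": 0.40},
    {"gene": "g4", "log2FoldChange": 1.10, "padj": None},
]

up, down = count_de(toy_rows, fdr=0.1, min_abs_lfc=0.0)
print(up, down, up + down)

# Consistency check using the numbers reported in this document.
assert 5626 + 4691 == 10317
```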

Plots of the most significant genes

Plot top 9 genes

Top DE genes

gene_id baseMean log2FoldChange lfcSE stat pvalue padj absMaxLog2FC
ENSMUSG00000014813 173.6603 2.1507312 0.1068010 20.13775 0 0 2.1507312
ENSMUSG00000038071 259.2756 3.4667747 0.1951606 17.76370 0 0 3.4667747
ENSMUSG00000036853 610.5610 2.0357935 0.1305086 15.59893 0 0 2.0357935
ENSMUSG00000046546 994.8930 1.0829969 0.0700300 15.46476 0 0 1.0829969
ENSMUSG00000022123 270.4489 1.2116727 0.0872539 13.88675 0 0 1.2116727
ENSMUSG00000026822 3151.2868 4.3673037 0.3159933 13.82087 0 0 4.3673037
ENSMUSG00000039405 1055.6664 0.9514464 0.0691527 13.75863 0 0 0.9514464
ENSMUSG00000031390 731.0540 0.9262546 0.0679541 13.63058 0 0 0.9262546
ENSMUSG00000048834 102.9005 -1.6088336 0.1188446 -13.53729 0 0 1.6088336
ENSMUSG00000040405 1252.0868 4.6338388 0.3496244 13.25376 0 0 4.6338388
ENSMUSG00000053113 491.0256 3.5768235 0.2715283 13.17293 0 0 3.5768235
ENSMUSG00000001131 244.7676 4.3056384 0.3344138 12.87518 0 0 4.3056384
ENSMUSG00000031904 196.7504 1.1868056 0.0927694 12.79308 0 0 1.1868056
ENSMUSG00000029811 543.3703 2.7326751 0.2140976 12.76369 0 0 2.7326751
ENSMUSG00000085180 128.8999 1.4768996 0.1175146 12.56780 0 0 1.4768996
ENSMUSG00000050335 1880.2186 1.8094092 0.1444747 12.52406 0 0 1.8094092
ENSMUSG00000061947 410.6397 3.7060789 0.2958967 12.52491 0 0 3.7060789
ENSMUSG00000028364 460.6414 1.9161503 0.1602581 11.95665 0 0 1.9161503
ENSMUSG00000057751 129.0649 3.1432037 0.2646103 11.87861 0 0 3.1432037
ENSMUSG00000074653 146.4168 1.3726445 0.1160311 11.82997 0 0 1.3726445

Here, I considered only the condition to get the DE genes. This comparison captures genes whose mean expression differs between the two conditions, and helps to detect genes that are consistently up- or down-regulated.
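As a quick check on how to read the table for this comparison, the stat column is consistent with a Wald statistic, that is, log2FoldChange divided by lfcSE. The sketch below recomputes it for four rows copied from the table above. That the report used the Wald test for this comparison is an inference from these numbers, not something the report states.

```python
# (gene, log2FoldChange, lfcSE, stat) copied from the mice_model_jck_wt table.
rows = [
    ("ENSMUSG00000014813", 2.1507312, 0.1068010, 20.13775),
    ("ENSMUSG00000038071", 3.4667747, 0.1951606, 17.76370),
    ("ENSMUSG00000036853", 2.0357935, 0.1305086, 15.59893),
    ("ENSMUSG00000048834", -1.6088336, 0.1188446, -13.53729),
]

max_diff = 0.0
for gene, lfc, se, reported_stat in rows:
    wald = lfc / se  # Wald statistic: estimate divided by its standard error
    max_diff = max(max_diff, abs(wald - reported_stat))
    print(f"{gene}: recomputed {wald:.5f}, reported {reported_stat:.5f}")
```

The same relationship does not hold for the stat column of the mice_model_late table, where the statistic is positive even for negative fold changes, so that comparison appears to use a different test.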

Comparison: mice_model_late


out of 29001 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up) : 915, 3.2%
LFC < 0 (down) : 1189, 4.1%
outliers [1] : 10, 0.034%
low counts [2] : 12667, 44%
(mean count < 8)
[1] see ‘cooksCutoff’ argument of ?results
[2] see ‘independentFiltering’ argument of ?results

Differential expression file at: mice_model_late_de.csv

Normalized counts matrix file at: mice_model_late_log2_counts.csv

MA plot

Volcano plot

QC for DE genes p-values/variance

Most significant, FDR < 0.05 and |log2FC| > 0.5: 1089

Plots of the most significant genes

Plot top 9 genes

Top DE genes

gene_id baseMean log2FoldChange lfcSE stat pvalue padj symbol description absMaxLog2FC
ENSMUSG00000029304 114749.89505 -3.3671436 0.3772822 86.30123 0 0.0e+00 Spp1 secreted phosphoprotein 1 3.3671436
ENSMUSG00000052392 633.69960 1.7788144 0.2096200 72.09577 0 0.0e+00 Acot4 acyl-CoA thioesterase 4 1.7788144
ENSMUSG00000022037 18586.52065 -2.6294328 0.3702635 68.51329 0 0.0e+00 Clu clusterin 2.6294328
ENSMUSG00000022010 12995.60801 -2.0083478 0.2540145 66.03906 0 0.0e+00 Tsc22d1 TSC22 domain family, member 1 2.0083478
ENSMUSG00000023019 10901.79058 1.3714163 0.1997442 64.40358 0 0.0e+00 Gpd1 glycerol-3-phosphate dehydrogenase 1 (soluble) 1.3714163
ENSMUSG00000058952 352.58400 -2.1009773 0.3293173 64.27045 0 0.0e+00 Cfi complement component factor i 2.1009773
ENSMUSG00000021228 1129.42917 2.1216502 0.2994171 60.95581 0 0.0e+00 Acot3 acyl-CoA thioesterase 3 2.1216502
ENSMUSG00000002565 2610.44797 -1.3796893 0.1852228 59.53631 0 0.0e+00 Scin scinderin 1.3796893
ENSMUSG00000024479 2033.08115 -1.3939719 0.1922656 58.31450 0 0.0e+00 Mal2 mal, T cell differentiation protein 2 1.3939719
ENSMUSG00000029811 543.37028 -2.6294878 0.4654234 55.78897 0 0.0e+00 Aoc1 amine oxidase, copper-containing 1 2.6294878
ENSMUSG00000015852 114.67300 -3.1380608 0.6290703 52.75188 0 1.0e-07 Fcrls Fc receptor-like S, scavenger receptor 3.1380608
ENSMUSG00000029484 581.93663 -1.8376379 0.2676036 52.32848 0 2.0e-07 Anxa3 annexin A3 1.8376379
ENSMUSG00000031482 2486.60168 0.9853763 0.1641964 51.00154 0 3.0e-07 Slc25a15 solute carrier family 25 (mitochondrial carrier ornithine transporter), member 15 0.9853763
ENSMUSG00000053063 56.71267 -3.6571623 0.6138824 48.74406 0 8.0e-07 Clec12a C-type lectin domain family 12, member a 3.6571623
ENSMUSG00000038642 1086.93525 -2.7162279 0.4504279 47.95844 0 1.0e-06 Ctss cathepsin S 2.7162279
ENSMUSG00000026255 2411.40769 0.8457319 0.1548358 47.16852 0 1.4e-06 Efhd1 EF hand domain containing 1 0.8457319
ENSMUSG00000033860 1758.48435 -3.0693263 0.4946513 47.10698 0 1.4e-06 Fgg fibrinogen gamma chain 3.0693263
ENSMUSG00000036594 3672.02577 -2.1823271 0.3872259 46.94062 0 1.4e-06 H2-Aa histocompatibility 2, class II antigen A, alpha 2.1823271
ENSMUSG00000035649 313.74368 -0.3660899 0.2337200 46.60849 0 1.6e-06 Zcchc7 zinc finger, CCHC domain containing 7 0.3660899
ENSMUSG00000037348 726.05890 0.9785023 0.1854403 46.07993 0 1.9e-06 Paqr7 progestin and adipoQ receptor family member VII 0.9785023

GO enrichment of DE genes (|log2FC| > 0.5 and FDR < 0.05): 1089

ID Description GeneRatio BgRatio pvalue p.adjust qvalue
GO:0050900 GO:0050900 leukocyte migration 48/991 263/20996 0.0000000 0.0000000 0.0000000
GO:0050865 GO:0050865 regulation of cell activation 63/991 467/20996 0.0000000 0.0000000 0.0000000
GO:0002250 GO:0002250 adaptive immune response 50/991 345/20996 0.0000000 0.0000000 0.0000000
GO:0044282 GO:0044282 small molecule catabolic process 40/991 246/20996 0.0000000 0.0000000 0.0000000
GO:0051186 GO:0051186 cofactor metabolic process 43/991 322/20996 0.0000000 0.0000001 0.0000001
GO:0019884 GO:0019884 antigen processing and presentation of exogenous antigen 13/991 33/20996 0.0000000 0.0000001 0.0000001
GO:0071345 GO:0071345 cellular response to cytokine stimulus 51/991 448/20996 0.0000000 0.0000005 0.0000004
GO:0001819 GO:0001819 positive regulation of cytokine production 44/991 357/20996 0.0000000 0.0000005 0.0000004
GO:0030198 GO:0030198 extracellular matrix organization 31/991 208/20996 0.0000000 0.0000010 0.0000007
GO:0006897 GO:0006897 endocytosis 52/991 475/20996 0.0000000 0.0000011 0.0000008
GO:0044236 GO:0044236 multicellular organism metabolic process 19/991 95/20996 0.0000001 0.0000044 0.0000033
GO:0009636 GO:0009636 response to toxic substance 24/991 153/20996 0.0000002 0.0000108 0.0000080
GO:1990267 GO:1990267 response to transition metal nanoparticle 16/991 78/20996 0.0000006 0.0000251 0.0000187
GO:1901615 GO:1901615 organic hydroxy compound metabolic process 45/991 440/20996 0.0000010 0.0000388 0.0000288
GO:0010038 GO:0010038 response to metal ion 27/991 204/20996 0.0000013 0.0000489 0.0000364
GO:0043410 GO:0043410 positive regulation of MAPK cascade 44/991 434/20996 0.0000017 0.0000596 0.0000443
GO:0042493 GO:0042493 response to drug 27/991 216/20996 0.0000039 0.0001225 0.0000910
GO:0015711 GO:0015711 organic anion transport 37/991 356/20996 0.0000062 0.0001807 0.0001343
GO:0048146 GO:0048146 positive regulation of fibroblast proliferation 13/991 65/20996 0.0000089 0.0002461 0.0001829
GO:0051259 GO:0051259 protein oligomerization 46/991 493/20996 0.0000090 0.0002461 0.0001829
GO:0006837 GO:0006837 serotonin transport 7/991 21/20996 0.0000332 0.0007227 0.0005371
GO:0072593 GO:0072593 reactive oxygen species metabolic process 26/991 231/20996 0.0000379 0.0008031 0.0005968
GO:0072331 GO:0072331 signal transduction by p53 class mediator 16/991 114/20996 0.0000897 0.0016539 0.0012291
GO:0097191 GO:0097191 extrinsic apoptotic signaling pathway 25/991 230/20996 0.0000937 0.0017131 0.0012731
GO:0015748 GO:0015748 organophosphate ester transport 12/991 71/20996 0.0001116 0.0019528 0.0014512
GO:0051092 GO:0051092 positive regulation of NF-kappaB transcription factor activity 15/991 106/20996 0.0001342 0.0022214 0.0016508
GO:0006766 GO:0006766 vitamin metabolic process 11/991 63/20996 0.0001583 0.0025181 0.0018714
GO:0097006 GO:0097006 regulation of plasma lipoprotein particle levels 10/991 54/20996 0.0001889 0.0029333 0.0021799
GO:1901342 GO:1901342 regulation of vasculature development 25/991 243/20996 0.0002232 0.0032958 0.0024493
GO:0007229 GO:0007229 integrin-mediated signaling pathway 12/991 77/20996 0.0002470 0.0035781 0.0026591
GO:0006081 GO:0006081 cellular aldehyde metabolic process 10/991 57/20996 0.0002999 0.0041588 0.0030907
GO:0006577 GO:0006577 amino-acid betaine metabolic process 5/991 14/20996 0.0003248 0.0043442 0.0032284
GO:0010998 GO:0010998 regulation of translational initiation by eIF2 alpha phosphorylation 5/991 14/20996 0.0003248 0.0043442 0.0032284
GO:0051051 GO:0051051 negative regulation of transport 40/991 477/20996 0.0003322 0.0044306 0.0032927
GO:0051495 GO:0051495 positive regulation of cytoskeleton organization 20/991 181/20996 0.0003657 0.0048056 0.0035713
GO:0018108 GO:0018108 peptidyl-tyrosine phosphorylation 28/991 296/20996 0.0003916 0.0050876 0.0037809
GO:0006968 GO:0006968 cellular defense response 5/991 15/20996 0.0004683 0.0056929 0.0042307
GO:0001503 GO:0001503 ossification 33/991 377/20996 0.0005094 0.0061258 0.0045524
GO:0044089 GO:0044089 positive regulation of cellular component biogenesis 35/991 409/20996 0.0005298 0.0063001 0.0046819
GO:0043648 GO:0043648 dicarboxylic acid metabolic process 12/991 84/20996 0.0005615 0.0065952 0.0049013
GO:1901264 GO:1901264 carbohydrate derivative transport 8/991 41/20996 0.0005701 0.0066395 0.0049342
GO:0050673 GO:0050673 epithelial cell proliferation 32/991 364/20996 0.0005711 0.0066395 0.0049342
GO:0060390 GO:0060390 regulation of SMAD protein import into nucleus 5/991 16/20996 0.0006549 0.0074229 0.0055164
GO:2000351 GO:2000351 regulation of endothelial cell apoptotic process 8/991 42/20996 0.0006755 0.0075809 0.0056338
GO:0008360 GO:0008360 regulation of cell shape 15/991 123/20996 0.0006902 0.0077269 0.0057423
GO:0002507 GO:0002507 tolerance induction 6/991 25/20996 0.0008903 0.0092323 0.0068611
GO:0002643 GO:0002643 regulation of tolerance induction 5/991 17/20996 0.0008919 0.0092323 0.0068611
GO:0015695 GO:0015695 organic cation transport 5/991 17/20996 0.0008919 0.0092323 0.0068611
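To help read the GeneRatio and BgRatio columns in the table above: for leukocyte migration (GO:0050900), 48 of the 991 annotated DE genes carry the term, against 263 of 20996 annotated background genes. The sketch below computes the fold enrichment and an upper-tail hypergeometric p-value from those four numbers. It assumes the enrichment p-value is a one-sided hypergeometric test, which is how clusterProfiler's over-representation analysis is usually described; the helper name is hypothetical.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    denominator = comb(N, n)
    numerator = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    # Exact integer arithmetic, converted to float only at the end.
    return float(Fraction(numerator, denominator))


# Values copied from the GO:0050900 row of the table above.
k, n = 48, 991        # GeneRatio: DE genes with the term / annotated DE genes
K, N = 263, 20996     # BgRatio: background genes with the term / background

fold_enrichment = (k / n) / (K / N)
p_value = hypergeom_upper_tail(k, n, K, N)

print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"hypergeometric p-value: {p_value:.3g}")
```

The table shows p-values rounded to seven decimals, so very small values such as this one appear as 0.0000000.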

This comparison detects genes whose expression changes differently over time in the two conditions.

Clustering in common patterns

We used the diana function from the cluster R package to separate genes according to the correlation of their expression with time. Only clusters with more than 3 genes are shown. Significant genes were those with log2FC greater than 0.1 and FDR < 5%. The results of this analysis are in the file clusters_genes.tsv.
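For readers unfamiliar with diana (divisive analysis clustering), the sketch below shows a single divisive split on a correlation-based distance (1 minus the Pearson correlation between expression profiles). It is a simplified illustration of the idea, not the cluster::diana implementation, and the distance used in the report may differ. The gene names, profiles, and helper functions are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_distance(gene, group, dist):
    """Mean distance from `gene` to the other members of `group`."""
    others = [g for g in group if g != gene]
    if not others:
        return 0.0
    return sum(dist[gene][g] for g in others) / len(others)


def diana_split(genes, dist):
    """One divisive step: split `genes` into (remaining, splinter) groups."""
    remaining = list(genes)
    # Seed the splinter group with the gene most dissimilar to the rest.
    seed = max(remaining, key=lambda g: average_distance(g, remaining, dist))
    remaining.remove(seed)
    splinter = [seed]
    while len(remaining) > 1:
        # Positive gain: the gene is closer to the splinter group.
        gains = {
            g: average_distance(g, remaining, dist)
            - average_distance(g, splinter, dist)
            for g in remaining
        }
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        remaining.remove(best)
        splinter.append(best)
    return remaining, splinter


# Hypothetical expression profiles over four time points.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 9],
    "g3": [4, 3, 2, 1],
    "g4": [9, 6, 4, 2],
}
dist = {
    a: {b: 1 - pearson(profiles[a], profiles[b]) for b in profiles}
    for a in profiles
}

group_a, group_b = diana_split(list(profiles), dist)
print(sorted(group_a), sorted(group_b))
```

In this toy example the genes that increase over time end up in one group and those that decrease in the other. The full algorithm repeats the split on the cluster with the largest diameter until every gene is on its own, and the tree is then cut to obtain clusters.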

Using expression

Working with 1089 genes

Working with 1037 genes after filtering

Using fold change with two conditions

Working with 1089 genes

Working with 907 genes after filtering

Using fold change at each time point

Working with 1089 genes

Working with 1059 genes after filtering

R Session Info

(useful if replicating these results)

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.1 (Sierra)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    methods   stats     graphics  grDevices utils    
[8] datasets  base     

other attached packages:
 [1] vsn_3.40.0                 DEGreport_1.11.2          
 [3] quantreg_5.29              SparseM_1.72              
 [5] dplyr_0.5.0                cluster_2.0.5             
 [7] org.Mm.eg.db_3.3.0         AnnotationDbi_1.34.4      
 [9] clusterProfiler_3.0.5      DOSE_2.10.7               
[11] gridExtra_2.2.1            logging_0.7-103           
[13] tximport_1.0.3             DESeq2_1.12.4             
[15] SummarizedExperiment_1.2.3 Biobase_2.32.0            
[17] GenomicRanges_1.24.3       GenomeInfoDb_1.8.7        
[19] IRanges_2.6.1              S4Vectors_0.10.3          
[21] BiocGenerics_0.18.0        pheatmap_1.0.8            
[23] CHBUtils_0.1               edgeR_3.14.0              
[25] limma_3.28.21              gplots_3.0.1              
[27] reshape_0.8.6              ggplot2_2.2.0             
[29] myRfunctions_0.1           knitr_1.15.1              
[31] rmarkdown_1.1             

loaded via a namespace (and not attached):
 [1] bitops_1.0-6          matrixStats_0.51.0    RColorBrewer_1.1-2   
 [4] tools_3.3.1           affyio_1.42.0         R6_2.2.0             
 [7] rpart_4.1-10          KernSmooth_2.23-15    Hmisc_3.17-4         
[10] DBI_0.5-1             lazyeval_0.2.0        colorspace_1.2-7     
[13] nnet_7.3-12           preprocessCore_1.34.0 Nozzle.R1_1.1-1      
[16] chron_2.3-47          graph_1.50.0          labeling_0.3         
[19] topGO_2.24.0          caTools_1.17.1        scales_0.4.1         
[22] affy_1.50.0           readr_1.0.0           genefilter_1.54.2    
[25] stringr_1.1.0         digest_0.6.10         foreign_0.8-67       
[28] XVector_0.12.1        htmltools_0.3.5       highr_0.6            
[31] RSQLite_1.0.0         BiocInstaller_1.22.3  BiocParallel_1.6.6   
[34] gtools_3.5.0          acepack_1.3-3.3       GOSemSim_1.30.3      
[37] RCurl_1.95-4.8        magrittr_1.5          GO.db_3.3.0          
[40] Formula_1.2-1         Matrix_1.2-7.1        Rcpp_0.12.7          
[43] munsell_0.4.3         stringi_1.1.2         yaml_2.1.14          
[46] zlibbioc_1.18.0       plyr_1.8.4            qvalue_2.4.2         
[49] grid_3.3.1            gdata_2.17.0          DO.db_2.9            
[52] lattice_0.20-34       splines_3.3.1         annotate_1.50.1      
[55] locfit_1.5-9.1        igraph_1.0.1          geneplotter_1.50.0   
[58] reshape2_1.4.1        codetools_0.2-15      XML_3.98-1.4         
[61] evaluate_0.10         latticeExtra_0.6-28   data.table_1.9.6     
[64] MatrixModels_0.4-1    gtable_0.2.0          tidyr_0.6.0          
[67] assertthat_0.1        xtable_1.8-2          coda_0.19-1          
[70] survival_2.39-5       tibble_1.2            GSEABase_1.34.1